Rank | Count | Beginning |
---|---|---|
82931 | 26104 | Die |
47705 | 13030 | Das |
68414 | 12492 | Der |
175781 | 6908 | In |
260346 | 5582 | Und |
136067 | 5539 | Es |
170614 | 4551 | Im |
241195 | 4252 | Sie |
129887 | 4223 | Er |
17832 | 4193 | Auch |
164591 | 4144 | Ich |
204996 | 3833 | Mit |
120768 | 3569 | Ein |
147683 | 3140 | Für |
432 | 3125 | Aber |
28130 | 2639 | Bei |
246342 | 2589 | So |
210809 | 2533 | Nach |
121595 | 2496 | Eine |
8398 | 2352 | Als |
288623 | 2235 | Wir |
11346 | 2167 | Am |
281312 | 2097 | Wenn |
103693 | 2077 | Diese |
285595 | 2060 | Wie |
115043 | 1963 | Doch |
22114 | 1769 | Auf |
39367 | 1675 | Da |
65868 | 1656 | Denn |
277088 | 1653 | Was |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
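The same count can be sketched outside the database. The following is a minimal Python illustration, assuming the sentences are available as a plain list of strings; the function name `sentence_beginnings` and the toy sentences are hypothetical and not part of the corpus tooling. It mirrors `substring_index(sentence, ' ', N)` by splitting on single spaces and keeping the first N pieces, so the same function also covers N=2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words."""
    counts = Counter()
    for sentence in sentences:
        # Like substring_index(sentence, ' ', n): split on single spaces and
        # keep the first n pieces (the whole sentence if it is shorter).
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by descending count, as in "order by cnt desc limit 50".
    return counts.most_common(limit)


# Hypothetical toy corpus, for illustration only.
sample = [
    "Die Sonne scheint.",
    "Die Katze schläft.",
    "Das Haus ist alt.",
    "Es regnet.",
    "Die Zeit vergeht.",
]

for beginning, count in sentence_beginnings(sample, n=1, limit=3):
    print(count, beginning)
```

Splitting on a single space is a deliberate choice here: it keeps punctuation attached to the word, exactly as the SQL query does, rather than applying a real tokenizer.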